A Unix-based Speech Data Collection Platform
Abstract
It is highly desirable to collect speech data from the telephone network via a digital interface, since this avoids the additional A/D conversion normally required by analog telephone data collection hardware. A popular solution is a T1 line, which offers 24 digital phone lines. The leading T1 interface for Sun workstations is a system developed by Linkon Corporation. Using the Linkon framework, we have developed a fully expandable, robust environment for platform-independent collection of telephone speech data. The object-oriented software libraries and intuitive GUI provide powerful tools with which even a novice user can efficiently prototype complex applications. The system is currently being deployed by the Linguistic Data Consortium to collect part of the next SWITCHBOARD Corpus.
Similar resources
The multimod application framework: A rapid application development tool for computer aided medicine
This paper describes a new application framework (OpenMAF) for rapid development of multimodal applications in computer-aided medicine. MAF applications are multimodal in data, in representation, and in interaction. The framework supports almost any type of biomedical data, including DICOM datasets, motion-capture recordings, or data from computer simulations (e.g. finite element modeling). The...
apertium-cy - a collaboratively-developed free RBMT system for Welsh to English
apertium-cy (http://www.cymraeg.org.uk) is a rule-based “gisting” machine translation system for Welsh to English, with both engine and data released under the GPL. We summarise the development of apertium-cy, evaluate its output, and discuss the advantages of a collaborative development model combined with rule-based MT for marginalised languages. 1. The Apertium platform apertium-cy is a “gistin...
Creating a Data Collection for Evaluating Rich Speech Retrieval
We describe the development of a test collection for the investigation of speech retrieval beyond identification of relevant content. This collection focuses on satisfying user information needs for queries associated with specific types of speech acts. The collection is based on an archive of Internet video from an Internet video sharing platform (blip.tv), and was provided by the MediaEval b...
MIKE: A Distributed object-oriented programming platform on top of the Mach micro-kernel
This paper describes the architecture and implementation of MIKE, a version of the IK distributed persistent object-oriented programming platform built on top of the Mach microkernel. MIKE's primary goal is to offer a single object-oriented programming paradigm for writing distributed applications. In MIKE, an application programmer can use C almost as he would in a non-distributed system. The platfo...
Design and implementation of a speech server for unix based multimedia applications
In this paper we describe a general purpose speech recognition server (SRS) that provides a standard interface between applications and speech recognition modules. The recognition modules cover different techniques such as speaker dependent or independent, isolated or connected word recognition. The SRS is designed mainly for multimedia applications running on a network of UNIX workstations. Our...